Code Tutorial
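import torch
import IPython.display as ipd
sr = 44100    # sample rate in Hz
duration = 5  # clip length in seconds
# Mono white noise with shape (channels, samples)
audio_sample = torch.randn(1, sr * duration)
# Render an inline audio player in the notebook
ipd.Audio(audio_sample.numpy(), rate=sr)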
Stable Audio Open Tutorial
Stable Audio Open is fully available through HuggingFace. To run Stable Audio Open locally, you’ll first need a HuggingFace account and an access token (referred to here as $HF_TOKEN), which you can generate by following https://huggingface.co/docs/huggingface_hub/en/quick-start#authentication. Once you have the token, export it as an environment variable with a bash command like:
export HF_TOKEN="YOUR_HF_TOKEN"
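Because the notebook only sees variables exported in the shell that launched it, it can help to confirm the token is visible before going further. The following is a minimal sketch using only the standard library; the helper name `get_hf_token` is hypothetical and not part of any Stable Audio Open or HuggingFace API.

```python
import os


def get_hf_token(env=None):
    """Return the HF_TOKEN value from the environment, or raise a clear error.

    `env` defaults to os.environ; passing a plain dict makes the check easy to test.
    This is an illustrative helper, not a library function.
    """
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "HF_TOKEN is not set. Export it in the shell that launches the notebook."
        )
    return token


# Usage inside the notebook (raises RuntimeError if the variable is missing):
# token = get_hf_token()
```

If the check fails, restarting the notebook server from a shell where the export command has already been run is usually the simplest fix.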
The rest of the tutorial closely follows the demo in the public Stable Audio Open resources.
First, we’ll install some dependencies if you don’t already have them. Stable-Audio-Tools can be a bit finicky to install directly, so we suggest creating a dedicated virtual environment (rather than a conda environment) to run this notebook.
!pip install torch torchaudio torchvision stable-audio-tools einops
Requirement already satisfied: torch in /Users/jongwook/.virtualenvs/openai/lib/python3.11/site-packages (2.4.1+openai.d84b6ca395c.branchsuffix.oai.cuda.missing.os.unknown.builderversion.4)
Requirement already satisfied: torchaudio in /Users/jongwook/.virtualenvs/openai/lib/python3.11/site-packages (2.4.1+torch.commit.d84b6ca395c.cuda.missing.os.unknown.builderversion.4)
Requirement already satisfied: torchvision in /Users/jongwook/.virtualenvs/openai/lib/python3.11/site-packages (0.19.1+torch.commit.d84b6ca395c.cuda.missing.os.unknown.builderversion.4)
Requirement already satisfied: stable-audio-tools in /Users/jongwook/.virtualenvs/openai/lib/python3.11/site-packages (0.0.16)
Requirement already satisfied: einops in /Users/jongwook/.virtualenvs/openai/lib/python3.11/site-packages (0.7.0)
Requirement already satisfied: future>=0.16.0 in /Users/jongwook/.virtualenvs/openai/lib/python3.11/site-packages (from pyloudnorm->descript-audiotools>=0.7.2->descript-audio-codec==1.0.0->stable-audio-tools) (1.0.0)
Requirement already satisfied: fire in /Users/jongwook/.virtualenvs/openai/lib/python3.11/site-packages (from randomname->descript-audiotools>=0.7.2->descript-audio-codec==1.0.0->stable-audio-tools) (0.4.0)
Requirement already satisfied: executing>=1.2.0 in /Users/jongwook/.virtualenvs/openai/lib/python3.11/site-packages (from stack-data->ipython->aeiou==0.0.20->stable-audio-tools) (2.0.1)
Requirement already satisfied: asttokens>=2.1.0 in /Users/jongwook/.virtualenvs/openai/lib/python3.11/site-packages (from stack-data->ipython->aeiou==0.0.20->stable-audio-tools) (2.4.1)
Requirement already satisfied: pure-eval in /Users/jongwook/.virtualenvs/openai/lib/python3.11/site-packages (from stack-data->ipython->aeiou==0.0.20->stable-audio-tools) (0.2.3)
Requirement already satisfied: absl-py>=0.4 in /Users/jongwook/.virtualenvs/openai/lib/python3.11/site-packages (from tensorboard->descript-audiotools>=0.7.2->descript-audio-codec==1.0.0->stable-audio-tools) (2.1.0)
Requirement already satisfied: grpcio>=1.48.2 in /Users/jongwook/.virtualenvs/openai/lib/python3.11/site-packages (from tensorboard->descript-audiotools>=0.7.2->descript-audio-codec==1.0.0->stable-audio-tools) (1.65.1)
Requirement already satisfied: tensorboard-data-server<0.8.0,>=0.7.0 in /Users/jongwook/.virtualenvs/openai/lib/python3.11/site-packages (from tensorboard->descript-audiotools>=0.7.2->descript-audio-codec==1.0.0->stable-audio-tools) (0.7.2)
Requirement already satisfied: werkzeug>=1.0.1 in /Users/jongwook/.virtualenvs/openai/lib/python3.11/site-packages (from tensorboard->descript-audiotools>=0.7.2->descript-audio-codec==1.0.0->stable-audio-tools) (2.2.3)
Requirement already satisfied: mdurl~=0.1 in /Users/jongwook/.virtualenvs/openai/lib/python3.11/site-packages (from markdown-it-py->panel>=1.0->holoviews->aeiou==0.0.20->stable-audio-tools) (0.1.2)
Requirement already satisfied: webencodings in /Users/jongwook/.virtualenvs/openai/lib/python3.11/site-packages (from bleach->panel>=1.0->holoviews->aeiou==0.0.20->stable-audio-tools) (0.5.1)
Requirement already satisfied: termcolor in /Users/jongwook/.virtualenvs/openai/lib/python3.11/site-packages (from fire->randomname->descript-audiotools>=0.7.2->descript-audio-codec==1.0.0->stable-audio-tools) (2.3.0)
Requirement already satisfied: uc-micro-py in /Users/jongwook/.virtualenvs/openai/lib/python3.11/site-packages (from linkify-it-py->panel>=1.0->holoviews->aeiou==0.0.20->stable-audio-tools) (1.0.3)
Using cached argparse-1.4.0-py2.py3-none-any.whl (23 kB)
Installing collected packages: argparse
Successfully installed argparse-1.4.0
[notice] A new release of pip is available: 24.1.2 -> 24.3.1
[notice] To update, run: pip install --upgrade pip
If you are running this locally, you can simply set HF_TOKEN in your local environment (as done below). If you’re using a Colab notebook, you first need to add your HF_TOKEN as a “secret key” in Colab, and the command below won’t have any effect in that case.
import os
os.environ['HF_TOKEN'] = 'Your API key'
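Hard-coding the token in a notebook makes it easy to leak when the notebook is shared. As a minimal sketch, one could instead read the token from the environment and fail early if it is missing or still set to the placeholder. The helper name `resolve_hf_token` and the placeholder check are illustrative assumptions, not part of any Hugging Face API.

```python
import os

PLACEHOLDER = "Your API key"  # the placeholder string used in the cell above


def resolve_hf_token(env=None):
    """Return the token stored in HF_TOKEN, or raise a clear error.

    Hypothetical helper for illustration: it only inspects the HF_TOKEN
    variable and does not contact the Hugging Face Hub.
    """
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    if not token or token == PLACEHOLDER:
        raise RuntimeError("HF_TOKEN is not set; export it before loading the model.")
    return token
```

Calling `resolve_hf_token()` before the model download turns a long authentication traceback into a one-line error message.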
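Next, we can load the model from Hugging Face. The model repository is gated, so the download fails with a GatedRepoError if your token is missing or your account has not been granted access to stabilityai/stable-audio-open-1.0. Note that there are some known dependency issues with stable-audio-tools on M1 Macs, so we recommend running this as a Colab notebook (or on a Linux system).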
import torch
import torchaudio
# import librosa
from einops import rearrange
from stable_audio_tools import get_pretrained_model
from stable_audio_tools.inference.generation import generate_diffusion_cond
import IPython.display as ipd
from functools import partial
device = "cuda" if torch.cuda.is_available() else "cpu"
# Download model
model, model_config = get_pretrained_model("stabilityai/stable-audio-open-1.0")
sample_rate = model_config["sample_rate"]
sample_size = model_config["sample_size"]
model = model.to(device)
---------------------------------------------------------------------------
HTTPError Traceback (most recent call last)
File ~/.virtualenvs/openai/lib/python3.11/site-packages/huggingface_hub/utils/_http.py:406, in hf_raise_for_status(response, endpoint_name)
405 try:
--> 406 response.raise_for_status()
407 except HTTPError as e:
File ~/.virtualenvs/openai/lib/python3.11/site-packages/requests/models.py:1024, in Response.raise_for_status(self)
1023 if http_error_msg:
-> 1024 raise HTTPError(http_error_msg, response=self)
HTTPError: 401 Client Error: Unauthorized for url: https://huggingface.co/stabilityai/stable-audio-open-1.0/resolve/main/model_config.json
The above exception was the direct cause of the following exception:
GatedRepoError Traceback (most recent call last)
Cell In[4], line 13
10 device = "cuda" if torch.cuda.is_available() else "cpu"
12 # Download model
---> 13 model, model_config = get_pretrained_model("stabilityai/stable-audio-open-1.0")
14 sample_rate = model_config["sample_rate"]
15 sample_size = model_config["sample_size"]
GatedRepoError: 401 Client Error. (Request ID: Root=1-6730935f-7ec17b9d4716cb7f5e6fbfc2;0d63d326-965c-4988-ba00-4f70b3601578)
Cannot access gated repo for url https://huggingface.co/stabilityai/stable-audio-open-1.0/resolve/main/model_config.json.
Access to model stabilityai/stable-audio-open-1.0 is restricted. You must have access to it and be authenticated to access it. Please log in.
First, we’ll wrap the sampling code in a simpler function, as there are a few parameters that need to be provided but are not especially useful to play around with.
# this just cleans things up a bit so the code below highlights the important knobs
easy_generate = partial(generate_diffusion_cond, sample_size=sample_size, sigma_min=0.3, sigma_max=500, device=device)
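To make the role of `partial` concrete, here is a small self-contained sketch. `fake_generate` is a hypothetical stand-in for the real sampling function (it only echoes its arguments), and the values are arbitrary.

```python
from functools import partial


def fake_generate(model, steps=100, cfg_scale=1.0, sample_size=0, device="cpu"):
    """Stand-in for a sampling function; it just echoes its arguments."""
    return {
        "model": model,
        "steps": steps,
        "cfg_scale": cfg_scale,
        "sample_size": sample_size,
        "device": device,
    }


# Pin the arguments we do not want to repeat at every call site.
easy = partial(fake_generate, sample_size=2048, device="cpu")

# Only the interesting knobs are passed per call.
result = easy("toy-model", steps=50, cfg_scale=7.5)

# Pinned keyword arguments can still be overridden when needed.
overridden = easy("toy-model", sample_size=4096)
```

The same pattern is why `easy_generate` can later be called with only the model, the conditioning, and the sampler settings.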
Next we can define our conditioning, which for the default Stable Audio Open involves text, timing, and overall length.
# Set up text and timing conditioning
conditioning = [{
"prompt": "clean guitar, sweep picking, 140 bpm, G minor",
"seconds_start": 0, # where in time the sample starts within the song
"seconds_total": 30 # total sample length in seconds; the rest gets padded with silence
}]
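Since the requested length has to fit inside the model's fixed generation window, it can help to validate the conditioning before sampling. The sketch below assumes the window length in seconds is `sample_size / sample_rate`. The helper `make_conditioning` is hypothetical, and the numbers 44100 and 2097152 are example values assumed here, so read the real ones from `model_config`.

```python
def make_conditioning(prompt, seconds_total, sample_rate, sample_size, seconds_start=0):
    """Build a one-item conditioning list after basic sanity checks.

    Hypothetical helper: assumes the longest clip the model can produce
    is sample_size / sample_rate seconds.
    """
    max_seconds = sample_size / sample_rate
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    if seconds_start < 0:
        raise ValueError("seconds_start must be non-negative")
    if not 0 < seconds_total <= max_seconds:
        raise ValueError(f"seconds_total must be in (0, {max_seconds:.2f}]")
    return [{
        "prompt": prompt,
        "seconds_start": seconds_start,
        "seconds_total": seconds_total,
    }]


# Example values (assumed); in the notebook use sample_rate and sample_size
# as read from model_config.
example = make_conditioning("clean guitar", 30, sample_rate=44100, sample_size=2097152)
```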
seed = 1000
n_steps = 50
cfg = 7.5
sampler = "dpmpp-3m-sde"
output = easy_generate(
model,
conditioning=conditioning,
steps=n_steps, # number of diffusion steps to run
cfg_scale=cfg, # classifier-free guidance scale
sampler_type=sampler, # sampling "algorithm", check out https://github.com/Stability-AI/stable-audio-tools/blob/main/stable_audio_tools/inference/sampling.py#L177 for more options
seed=seed,
)
# Rearrange audio batch to a single sequence
output = rearrange(output, "b d n -> d (b n)")
# Peak normalize, clip, convert to int16, and trim to the requested length
output = output.to(torch.float32).div(torch.max(torch.abs(output))).clamp(-1, 1).mul(32767).to(torch.int16).cpu()[:, :round(conditioning[0]['seconds_total']*sample_rate)]
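The one-liner above packs several steps together, so here is a plain-Python sketch of the same arithmetic on a short list of samples. The function name `to_int16_pcm` is hypothetical, and the guard against an all-zero signal is an addition that the one-liner does not have (dividing by a zero peak would produce NaNs there).

```python
def to_int16_pcm(samples, seconds_total, sample_rate):
    """Peak-normalize, clip to [-1, 1], scale to int16 range, and trim."""
    # The peak is taken over the whole signal, before trimming.
    peak = max((abs(x) for x in samples), default=0.0)
    # Guard added for this sketch: a silent signal stays silent.
    scale = 1.0 / peak if peak > 0 else 0.0
    # Number of samples that correspond to the requested duration.
    n_keep = round(seconds_total * sample_rate)
    pcm = []
    for x in samples[:n_keep]:
        y = max(-1.0, min(1.0, x * scale))  # clip to [-1, 1]
        pcm.append(int(y * 32767))  # truncate toward zero, like an int16 cast
    return pcm


# Toy signal at a 1 Hz "sample rate" so the trimming is easy to follow.
toy = to_int16_pcm([0.5, -2.0, 1.0, 0.25], seconds_total=3, sample_rate=1)
```

With the 30-second conditioning used above and a 44100 Hz sample rate, the trim would keep `round(30 * 44100)` samples.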
Now we can listen to the output! Note: if you are running this in a Colab notebook, rendering audio will stop the autosave feature, so be sure to delete the cell outputs if you want to turn it back on.
ipd.display(ipd.Audio(output, rate=sample_rate))